ESDF: Ensemble Selection using Diversity and Frequency
Authors
Abstract
Ensemble selection for consensus clustering has recently emerged as a research problem in machine intelligence. Consensus clustering algorithms normally take the entire ensemble of clusterings into account, and the ensemble generated before computing the consensus tends to be very large. Instead of considering the entire ensemble, one can judiciously select a few of its partitions without compromising the quality of the consensus. This can make consensus computation more efficient and avoid unnecessary computational overhead. The ensemble selection problem addresses this aspect of consensus clustering. In this paper, we propose an efficient method of ensemble selection for a large ensemble. We prioritize the partitions in the ensemble based on diversity and frequency. Our method selects the top K partitions in order of priority, where K is chosen by the user. We observe that considering diversity and frequency jointly helps identify a few representative partitions whose consensus is qualitatively better than the consensus of the entire ensemble. Experimental analysis on a large number of datasets shows that our method gives better results than earlier ensemble selection methods.
Keywords: cluster analysis, consensus clustering, data clustering, ensemble selection methods
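The abstract does not state how diversity and frequency are measured or combined, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that frequency is the share of the ensemble taken up by a partition (up to label renaming), that diversity is the mean of one minus the Rand index against the other distinct partitions, and that the two are mixed with a weight `alpha`. The function names `canonical`, `rand_index`, and `select_top_k`, and the parameter `alpha`, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def canonical(labels):
    """Relabel clusters by order of first appearance, so that partitions
    with the same grouping but different label names compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(lab, len(mapping)) for lab in labels)


def rand_index(p, q):
    """Fraction of object pairs on which two partitions agree
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(p)), 2))
    agree = sum((p[i] == p[j]) == (q[i] == q[j]) for i, j in pairs)
    return agree / len(pairs)


def select_top_k(ensemble, k, alpha=0.5):
    """Rank distinct partitions by an ASSUMED score,
    alpha * frequency + (1 - alpha) * diversity, and return the top k."""
    counts = Counter(canonical(p) for p in ensemble)
    distinct = list(counts)
    scores = {}
    for p in distinct:
        others = [q for q in distinct if q != p]
        # Diversity: mean dissimilarity to the other distinct partitions.
        diversity = (
            sum(1 - rand_index(p, q) for q in others) / len(others)
            if others
            else 0.0
        )
        # Frequency: share of the ensemble occupied by this partition.
        frequency = counts[p] / len(ensemble)
        scores[p] = alpha * frequency + (1 - alpha) * diversity
    ranked = sorted(distinct, key=lambda p: scores[p], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Five clusterings of four objects; the first three are the same
    # partition under different label names.
    ensemble = [
        [0, 0, 1, 1],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 1, 0, 1],
        [0, 0, 0, 1],
    ]
    print(select_top_k(ensemble, k=2))
```

The selected partitions would then be passed to any consensus function in place of the full ensemble. The pairwise Rand index costs quadratic time in the number of objects, so a contingency-table formulation would be preferable for large datasets.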
Related resources
The ensemble clustering with maximize diversity using evolutionary optimization algorithms
Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...
Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge
The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory has been applied to clustering problems. Previous research has shown that this theory can significantly increase the stability and performance of...
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
With the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to enhance the accuracy of credit card fraud detection. After selecting the best and most effec...
Analysis of Methods and Strategies for Diversity Based Generation of Classifier Ensemble
A classifier ensemble is a group of individual base classifiers. Each classifier is trained individually on a modified version of the given data set to achieve diversity. During the testing phase, the results given by each classifier are combined into the final result using a technique called majority voting. Empirical results show that diversity among the base classifiers improves the accuracy of...
Journal title: CoRR
Volume: abs/1508.04333
Publication date: 2014